Introducing visual cues in acoustic-to-articulatory inversion
Abstract
The contribution of facial measures to statistical acoustic-to-articulatory inversion has been investigated. The tongue contour was estimated by linear estimation from either acoustics alone or acoustics combined with facial measures. Measures of the lateral movement of the lip corners and the vertical movement of the upper lip, the lower lip, and the jaw gave a substantial improvement over the audio-only case. It was further found that adding the corresponding articulatory measures that could be extracted from a profile view of the face, i.e. the protrusion of the lips, lip corners, and jaw, did not give any additional improvement of the inversion result. The present study hence suggests that audiovisual-to-articulatory inversion can be performed just as well using front-view monovision of the face as using stereovision of both the front and profile views.
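The comparison described in the abstract can be illustrated with a minimal sketch, assuming an ordinary least-squares linear estimator with a bias term. The abstract does not specify the feature sets, their dimensions, or the exact estimator, so every name, dimension, and the synthetic data below are hypothetical. The data are constructed so that the tongue contour depends on both feature streams; the sketch therefore shows only the mechanics of the audio-only versus audiovisual comparison and does not reproduce the study's results.

```python
import numpy as np


def fit_linear_estimator(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W (the bias is handled by a column of ones)."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def predict(W, X):
    """Apply a fitted linear estimator to new input frames."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return X1 @ W


def rmse(Y_true, Y_pred):
    """Root-mean-square error over all frames and contour points."""
    return float(np.sqrt(np.mean((Y_true - Y_pred) ** 2)))


def make_synthetic_data(n_frames=400, n_acoustic=12, n_facial=5, n_tongue=10, seed=0):
    """Hypothetical stand-in data; all dimensions are assumptions, not from the paper."""
    rng = np.random.default_rng(seed)
    acoustic = rng.normal(size=(n_frames, n_acoustic))
    facial = rng.normal(size=(n_frames, n_facial))
    A = rng.normal(size=(n_acoustic, n_tongue))
    F = rng.normal(size=(n_facial, n_tongue))
    # By construction the tongue contour depends on BOTH streams, plus small noise.
    noise = 0.1 * rng.normal(size=(n_frames, n_tongue))
    tongue = acoustic @ A + facial @ F + noise
    return acoustic, facial, tongue


def compare_inversions(seed=0, n_train=300):
    """Return held-out RMSE for the audio-only and the audiovisual estimator."""
    acoustic, facial, tongue = make_synthetic_data(seed=seed)
    audiovisual = np.hstack([acoustic, facial])

    W_a = fit_linear_estimator(acoustic[:n_train], tongue[:n_train])
    W_av = fit_linear_estimator(audiovisual[:n_train], tongue[:n_train])

    err_a = rmse(tongue[n_train:], predict(W_a, acoustic[n_train:]))
    err_av = rmse(tongue[n_train:], predict(W_av, audiovisual[n_train:]))
    return err_a, err_av


if __name__ == "__main__":
    err_a, err_av = compare_inversions()
    print(f"audio-only RMSE:  {err_a:.3f}")
    print(f"audiovisual RMSE: {err_av:.3f}")
```

Testing whether profile-view measures add anything would follow the same pattern: stack the extra columns onto `audiovisual`, refit, and compare the held-out error against the front-view-only estimator.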
Related resources
Audiovisual-to-articulatory inversion
It has been shown that acoustic-to-articulatory inversion, i.e. estimation of the articulatory configuration from the corresponding acoustic signal, can be greatly improved by adding visual features extracted from the speaker’s face. In order to make the inversion method usable in a realistic application, these features should be possible to obtain from a monocular frontal face video, where the...
Pronunciation analysis by acoustic-to-articulatory feature inversion
Second language learners may require assistance correcting their articulation of unfamiliar phonemes in order to reach the target pronunciation. If, e.g., a talking head is to provide the learner with feedback on how to change the articulation, a required first step is to be able to analyze the learner’s articulation. This paper describes how a specialized restricted acoustic-to-articulatory in...
Toward a Multi-Speaker Visual Articulatory Feedback System
In this paper, we present recent developments on the HMM-based acoustic-to-articulatory inversion approach that we develop for a “visual articulatory feedback” system. In this approach, multi-stream phoneme HMMs are trained jointly on synchronous streams of acoustic and articulatory data, acquired by electromagnetic articulography (EMA). Acoustic-to-articulatory inversion is achieved in two steps....
Speaker adaptation of an acoustic-to-articulatory inversion model using cascaded Gaussian mixture regressions
The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...
Publication date: 2005